Method and Apparatus to Perform Optimal Visually-Weighed Quantization of Time-Varying Visual Sequences in Transform Space

ABSTRACT

Pure transform-based technologies, such as the DCT or wavelets, can leverage a mathematical model based on one or a few parameters to generate the expected distribution of the transform components' energy, and generate ideal entropy removal configuration data continuously responsive to changes in video behavior. Construction of successive-refinement streams is supported by this technology, permitting response to changing channel conditions. Lossless compression is also supported by this process. The embodiment described herein uses a video correlation model to develop optimal entropy removal tables and optimal transmission sequence based on a combination of descriptive characteristics of the video source, enabling independent derivation of said optimal entropy removal tables and optimal transmission sequence in both encoder and decoder sides of the compression and playback process.

FEDERALLY SPONSORED RESEARCH

Not Applicable.

PATENT CASE TEXT

This application claims benefit of a prior filed U.S. provisional application Ser. No. 61/818,423, filed May 1, 2013.

BACKGROUND

1. Field of Invention

The present invention relates generally to compression of still image and moving video data, and more particularly to calculating the statistical behavior of the quantized transform representation of the video from the variances and correlations measured in pixel space. Once the statistical behavior of the video is modeled, the video streams can be collected into successive refinement streams (progressive mode), and the probabilities of the constructed symbols can be calculated for the purposes of entropy removal. The measured variances and correlations suffice to reconstruct the compressed video streams for any frame or group of frames.

2. Description of Prior Art

As depicted in FIG. 1, most prior-art still image and motion video compression algorithms perform a similar sequence of steps: an input stream 1010 is transform coded 1020, after which a process of motion estimation 1030 followed by equal-weight quantization 1040 or a process of visually-weighted quantization 1050 takes place; the resulting data is sequenced into transmission order 1060, symbols are collected 1070, and an entropy removal step 1080 results in a compressed data stream 1090. The essential innovation of the current invention is in the area of prediction of statistical behavior, which influences the processes of transmission order sequencing, symbol collection, and entropy removal. The current invention does not address the topic of motion estimation.

The JPEG zig-zag transmission order illustrated in FIG. 2 is a standard prior-art means of sequencing quantized coefficients into transmission order using a fixed pattern based on the average of statistics collected across a variety of sample content. The JPEG zig-zag order is a simple pattern which orders coefficients roughly into the order of increasing probability of zero. The JPEG zig-zag order is applied to one fixed-size block of the image at a time.
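For orientation, the fixed zig-zag pattern described above can be generated programmatically. The following is a minimal sketch, not taken from the JPEG specification text; the function name `zigzag_order` is hypothetical and chosen here for illustration only.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in JPEG-style zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]
        # Even anti-diagonals are walked from bottom-left to top-right,
        # odd anti-diagonals from top-right to bottom-left.
        if s % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order


# First few positions of the scan for an 8 x 8 block.
first_positions = zigzag_order(8)[:6]
```

Because the pattern depends only on the block size, it cannot adapt to the statistics of a particular image or video sequence.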

As depicted in FIG. 3, the prior-art JPEG-2000 standard implements various forms of progressive transmission, including spectral selection (FIG. 3 a), successive refinement (FIG. 3 b), and hierarchical (FIG. 3 c). FIG. 3 a depicts a plurality of two 8×8 quantized transform blocks 3010 covered by a plurality of three spectral bands, of which spectral band 3020 is typical. Data is collected within the band across all blocks 3030 and symbols are collected from the data within each band, which is then entropy coded and transmitted.

FIG. 3 b depicts a plurality of two 8×8 quantized transform blocks represented by a typical 2×2 entry 3110. The 2×2 entry of eight-bit numbers is divided into two successive refinement bands of four-bit representation, one of which is depicted 3120. The first four bits are collected across transform blocks into a transmission stream 3130 from which symbols will be collected and entropy coding will take place. The second four bits are similarly collected into transmission stream 3140.
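The bit-splitting step of successive refinement described above may be sketched as follows. This is an illustrative sketch only: it assumes unsigned eight-bit entries, and the function names `split_refinement_bands` and `merge_refinement_bands` are hypothetical.

```python
def split_refinement_bands(values):
    """Split unsigned 8-bit values into a high-nibble and a low-nibble stream."""
    first = [(v >> 4) & 0xF for v in values]   # most significant four bits
    second = [v & 0xF for v in values]         # least significant four bits
    return first, second


def merge_refinement_bands(first, second):
    """Rebuild the 8-bit values once both refinement streams have arrived."""
    return [(hi << 4) | lo for hi, lo in zip(first, second)]


entries = [0xA3, 0x07, 0xFF, 0x10]
first_stream, second_stream = split_refinement_bands(entries)
```

A decoder that has received only the first stream can already display a coarse approximation, and the second stream refines it.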

FIG. 3 c depicts a first-transmitted low-resolution image 3210, followed by a second-transmitted medium-resolution image 3220, and a final high-resolution image 3230. Each separate-resolution image is used to create its own transmission stream.

FIG. 4 depicts typical prior-art means of communicating entropy encoding statistics between compressing and decompressing apparatuses. It should be noted that these entropy statistics may be represented directly as a table of relative probabilities for the purposes of arithmetic encoding, or as Huffman tables.

The original JPEG specification provides for a pre-defined entropy pre-shared encoding table as depicted in FIG. 4 a. A preshared table 4020 is known to compressor 4010, and is used to generate the compressed data stream 4030. The preshared table 4050 known to the decompressor 4040 is used to decompress the received data stream. Pre-shared tables are intended to provide good compression based on the collection of statistics for a large collection of images.

As illustrated in FIG. 4 b, tables may be dynamically calculated and embedded in the transmission stream. A compressor 4110 calculates a table 4120, which it then transmits in-band 4130 in the compressed transmission stream 4140. The decompressor 4150 reads an in-band table 4160 and uses it to decompress the following compressed stream. This strategy enables better compression at the overhead cost of hundreds to thousands of bytes.

Each JPEG-2000 progressive transmission approach described above in FIG. 3 requires assembly of symbols over each progressive decomposition step (spectral band, successive bit representation, or resolution), giving different symbols and symbol distributions. As depicted in FIG. 4 c, a JPEG-2000 compressor 4210 calculates up to four entropy coding tables 4220 which it then transmits in-band 4230 in the compressed transmission stream 4240. The decompressor 4250 reads the in-band tables 4260 and uses them, as selected by each progressive stream, to decompress the following compressed data. If the tables are calculated to reflect typical progressive stream behavior, the tables may potentially be reusable.

Much effort has been expended in the incremental increase of efficiency in the communication of entropy coding statistics between compressing and decompressing apparatuses, but no significant advances can be claimed over the prior-art techniques described herein. The current invention discloses a far more efficient means of developing entropy tables independently in compressor and decompressor.

SUMMARY OF INVENTION

In accordance with one aspect of the invention, a method is provided for the optimal rearrangement of components into a transmission stream based on the calculated variance of individual quantized transform components from the measured variance and correlation of the raw untransformed visual samples.

A second aspect of the invention provides a method for the optimal calculation of entropy reduction tables for a transmission stream based on the calculated symbol probabilities based on the calculated probability distributions of individual quantized transform components.

A final aspect of the invention provides a method for the parallel construction of transmission stream rearrangement, symbol construction and entropy tables between compressing apparatus and decompressing apparatus via communication of the measured variances and correlations of the raw untransformed visual samples.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a prior-art compressor decomposed into the steps of transformation, quantization, transmission order sequencing, symbol collection, and entropy removal.

FIG. 2 depicts a prior-art compressor featuring per-block transmission order sequencing.

FIG. 3 depicts a prior-art compressor featuring three forms of progressive transmission order encoding: spectral selection, successive refinement, and hierarchical.

FIG. 4 depicts a prior-art compressor featuring various means of communication of entropy coding tables: pre-shared, in-band, and multiple tables.

FIG. 5 depicts a typical embodiment of the current invention into a compression apparatus and a decompression apparatus.

FIG. 6 depicts the steps typically required of a compression unit in order to perform block-by-block compression.

FIG. 7 illustrates hierarchical subband decomposition and compression.

FIG. 8 illustrates the calculations required to model per-quantized transform component variance from pixel variance and pixel correlation.

FIG. 9 illustrates the calculations required to predict per-symbol probabilities from quantized transform component variances.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 5 depicts a compression apparatus 5010 and a decompression apparatus 5020. Said compression apparatus 5010 is fed a sequential stream of visual data 5110, and factors said sequential stream of visual data 5110 into a plurality of multidimensional subblocks 5120. Said plurality of multidimensional subblocks 5120 is processed singly or jointly by a correlation measurement unit 5130 to produce a flow of measured variance values and measured correlation values to the decompression apparatus 5210 and a duplicate flow of measured variance values and measured correlation values to a compressor 5140. A compression unit 5150 uses said duplicate flow of measured variance values and measured correlation values to a compressor 5140 and said plurality of multidimensional subblocks 5120 to produce a compressed stream to the decompression apparatus 5220. Said decompression apparatus 5020 is comprised of a decompressor 5310 which processes said flow of measured variance values and measured correlation values to the decompression apparatus 5210 and said compressed stream to the decompression apparatus 5220 to produce a plurality of reconstructed multidimensional subblocks 5320.

FIG. 6 depicts a decomposition of said compression apparatus 6010 into typical processing steps used to perform individual block-by-block compression. Said flow of measured variance values and measured correlation values to the decompression apparatus 5210 results in a set of variance values and correlation values in the x, y and z directions valid for one subblock 6110 of said plurality of multidimensional subblocks. Said set of measured variance values and measured correlation values in the x, y and z directions valid for one subblock 6020 is processed through a step 6030 which calculates the variances for said quantized transform components of said one subblock 6110. In a further processing step 6040, said calculated variances for said quantized transform components of said one subblock from said step 6030 are used to calculate relative probabilities for each symbol.

The quantized transform components of said one subblock 6110 of said plurality of multidimensional subblocks are processed through a step 6120 to reorder quantized transform components into order of greatest probability of zero (lowest variance). Said step 6120 uses said calculated variances for said quantized transform components from said step 6030 to perform its sort processing.
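The variance-driven sort of step 6120 may be sketched as shown below. This is a sketch under stated assumptions: components are placed in order of decreasing calculated variance so that the components most likely to be zero fall at the end of the sequence, ties are broken by original index, and the function names are hypothetical. Because the permutation depends only on the calculated variances, a decompressor holding the same variances can derive the same permutation and invert it.

```python
def transmission_order(variances):
    """Indices of components sorted by decreasing modeled variance.

    Python's sort is stable, so equal variances keep their original
    relative order; both sides therefore derive the same permutation.
    """
    return sorted(range(len(variances)), key=lambda i: -variances[i])


def reorder(components, order):
    """Arrange components into transmission order."""
    return [components[i] for i in order]


def restore(reordered, order):
    """Invert the permutation on the receiving side."""
    out = [0] * len(order)
    for position, i in enumerate(order):
        out[i] = reordered[position]
    return out


variances = [4.0, 9.0, 0.5, 9.0]     # illustrative values only
components = [7, -3, 0, 5]
order = transmission_order(variances)
sent = reorder(components, order)
```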

Said reordered quantized transform components are then processed through a step 6130 of collection of said reordered quantized transform components into symbols. Each said collected symbol is then processed through a step 6140 of entropy coding of said symbol into a short sequence of bits. Said step 6140 uses said calculated relative probabilities for each symbol from said step 6040 in its entropy-removing calculations.
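The symbol collection of step 6130 may be sketched as follows, assuming symbols of the run-length form discussed with FIG. 9 (a run of zeros followed by a non-zero value described by its bit length) and an end-of-block marker for trailing zeros. The tuple layout and the name `collect_symbols` are hypothetical choices made for this illustration.

```python
def collect_symbols(reordered):
    """Turn reordered components into (run, bits, value) tuples plus 'EOB'.

    run   : number of zeros preceding the non-zero value
    bits  : number of bits needed for the magnitude of the value
    value : the non-zero value itself
    """
    symbols = []
    run = 0
    for value in reordered:
        if value == 0:
            run += 1
            continue
        symbols.append((run, abs(value).bit_length(), value))
        run = 0
    if run:
        # Trailing zeros collapse into a single end-of-block marker.
        symbols.append("EOB")
    return symbols


example_symbols = collect_symbols([5, 0, 0, -3, 1, 0, 0, 0])
```

The better the reordering concentrates zeros at the end of the sequence, the more components are absorbed by the single end-of-block marker.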

Said short sequence of bits is finally processed through an aggregation step 6150 to concatenate generated bit sequences into a transport stream.

FIG. 7 depicts a typical implementation of the hierarchical type of progressive transmission. A sequential stream of visual data 7010 is subsampled from said sequential stream of visual data 5110. Said subsampled sequential stream of visual data 7010 is factored into a plurality of multidimensional subblocks 7020. Said plurality of multidimensional subblocks 7020 is then processed subblock by subblock by said compression unit 5150 to produce a sequence of compressed bits for transmission.

Once said subsampled sequential stream of visual data 7010 has been processed through said compression unit 5150, a higher-resolution sequential stream of visual data less subband data 7110 may be processed. Said higher-resolution sequential stream of visual data less subband data 7110 is comprised of said sequential stream of visual data 5110 in which each and every coefficient comprising said subsampled sequential stream of visual data 7010 is set to 0 with a variance of 0. Said higher-resolution sequential stream of visual data less subband data 7110 is factored into a plurality of multidimensional subblocks 7120. Said plurality of multidimensional subblocks 7120 is then processed subblock by subblock by said compression unit 5150 to produce a sequence of compressed bits for transmission.
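The two hierarchical passes may be illustrated with the sketch below. It assumes, purely for illustration, a two-dimensional frame held as a list of rows and subsampling by simple decimation with a fixed step; the description does not fix the subsampling method, and the name `split_hierarchical` is hypothetical.

```python
def split_hierarchical(frame, step=2):
    """Return (subsampled stream, full-resolution stream less subband data).

    Positions already carried by the subsampled stream are set to 0 in the
    second stream, so they contribute nothing further to be coded.
    """
    low = [row[::step] for row in frame[::step]]
    rest = [
        [0 if (r % step == 0 and c % step == 0) else value
         for c, value in enumerate(row)]
        for r, row in enumerate(frame)
    ]
    return low, rest


frame = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
low_stream, residual_stream = split_hierarchical(frame)
```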

FIG. 8 illustrates the calculations required to model per-quantized transform component variance from pixel variance and pixel correlation. A matrix DCT_(x) 8010 is comprised of the individual constants of discrete cosine transform convolution. Said matrix DCT_(x) 8010 is shown with the discrete cosine transform of a 4×4 convolution, but may in practice be composed of any orthonormal transform. Similar matrices DCT_(y) and DCT_(z) (in the case of three-dimensional said multidimensional subblocks 7120) will assume the length of each dimension of the said multidimensional subblocks 7020.

A covariance matrix A_(pixel,x) 8020 is composed of the multiplication of said measured pixel variance in the x direction by the autocorrelation matrix derived from said measured pixel correlation in the x direction. Similar matrices A_(pixel,y) and A_(pixel,z) (in the case of three-dimensional said multidimensional subblocks 7120) will utilize the measured pixel variance, pixel correlations and length of each dimension of the said multidimensional subblocks 7020.
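One way to build such a covariance matrix is sketched below. It assumes a first-order Markov model in which the autocorrelation between two pixels decays as the measured correlation raised to the power of their distance; this model and the name `pixel_covariance` are assumptions made for illustration, as the description states only that the autocorrelation matrix is derived from the measured pixel correlation.

```python
def pixel_covariance(variance, correlation, length):
    """Toeplitz covariance matrix: entry (i, j) = variance * correlation**|i - j|."""
    return [
        [variance * correlation ** abs(i - j) for j in range(length)]
        for i in range(length)
    ]


# Illustrative values: variance 100, correlation 0.5, block length 4.
a_pixel_x = pixel_covariance(100.0, 0.5, 4)
```

Under this assumption, a single variance and a single correlation per dimension fully determine the matrix, which is why only those values need to be communicated.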

DCT covariance matrix A_(x) 8030 is calculated as the product of said matrix DCT_(x) 8010, said covariance matrix A_(pixel,x) 8020, and the transpose of said matrix DCT_(x) 8010.
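The matrix product described above may be sketched in plain Python as follows. The orthonormal DCT-II matrix is used as the example transform; the pixel covariance in the usage lines assumes a first-order Markov model with illustrative values, and all function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix, row u and column i."""
    rows = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(column) for column in zip(*m)]


def transform_covariance(t, a_pixel):
    """Return T * A_pixel * T^T, the covariance in transform space."""
    return matmul(matmul(t, a_pixel), transpose(t))


# Assumed pixel covariance: variance 100, correlation 0.9, length 4.
a_pixel = [[100.0 * 0.9 ** abs(i - j) for j in range(4)] for i in range(4)]
a_x = transform_covariance(dct_matrix(4), a_pixel)
diagonal = [a_x[u][u] for u in range(4)]  # variance of each transformed component
```

For strongly correlated pixels the first diagonal entry dominates, reflecting the concentration of energy in the low-frequency components.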

The variance of the quantized transform component 8040 of index u,v,w within said multidimensional subblocks 7020, σ² _(u,v,w), is calculated as the product of the trace of said DCT covariance matrix A_(x) 8030 with the trace of said DCT covariance matrix A_(y) (and with the trace of said DCT covariance matrix A_(z) if said multidimensional subblocks 7020 are three-dimensional) divided by the quantizer value for said quantized transform component 8040 of index u,v,w within said multidimensional subblocks 7020.

FIG. 9 illustrates the process of calculating symbol probabilities. The maximum number of bits N_(MAX,u,v,w) 9010 required to encode any said quantized transform component of index u,v,w within said quantized transform subblock is calculated as the rounded-up integer of the logarithm base 2 of the product of the number of bits representing each pixel N_(IN), the square root of the product of the lengths of said multidimensional blocks divided by the quantizer Q_(u,v,w) of said quantized transform component.

The probability p_(u,v,w)(x==0) 9020 that any quantized transform component of index u,v,w within said quantized transform subblock is 0 is calculated from the Cumulative Distribution Function of a normal distribution with expectation of 0 and variance equal to that of said quantized transform component of index u,v,w within said quantized transform.
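A minimal sketch of this calculation follows. It assumes that a component quantizes to 0 when its unrounded value lies within half a quantization step of zero; the bin half-width, the treatment of a variance of 0 as a certain zero, and the function names are assumptions made for illustration.

```python
import math


def normal_cdf(x, variance):
    """Cumulative Distribution Function of a zero-mean normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * variance)))


def prob_zero(variance, half_width=0.5):
    """Probability mass of the bin (-half_width, +half_width) around zero."""
    if variance <= 0.0:
        return 1.0  # assumed convention: a zero-variance component is always 0
    return normal_cdf(half_width, variance) - normal_cdf(-half_width, variance)


p_zero_unit = prob_zero(1.0)
```

Components with small calculated variance receive a probability of zero close to 1, and components with large variance a probability close to 0.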

The probability p_(u,v,w)(log₂(x)==n) 9030 that any quantized transform component of index u,v,w within said quantized transform subblock has n bits in its representation is calculated from the Cumulative Distribution Function of a normal distribution with expectation of 0 and variance equal to that of said quantized transform component of index u,v,w within said quantized transform.
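The bit-length probability may be sketched in the same manner. The sketch assumes rounding to the nearest integer, so that a magnitude needs exactly n bits when the unrounded magnitude lies between 2^(n-1) - 0.5 and 2^n - 0.5; these bin edges and the function names are assumptions made for illustration.

```python
import math


def normal_cdf(x, variance):
    """Cumulative Distribution Function of a zero-mean normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * variance)))


def prob_bits(n, variance):
    """Probability that the rounded magnitude needs exactly n bits (n >= 1)."""
    low = 2 ** (n - 1) - 0.5
    high = 2 ** n - 0.5
    one_side = normal_cdf(high, variance) - normal_cdf(low, variance)
    return 2.0 * one_side  # the distribution is symmetric about zero


p_one_bit = prob_bits(1, 1.0)
```

Together with the probability of a zero value, these probabilities sum to 1 over all bit lengths, so they form a complete distribution from which an entropy table can be built.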

The probability of a typical symbol S_(u,v,w)(r,b) 9040, comprised of a run length of r zeros followed by a non-zero value of length b, is calculated as the conditional probability of each symbol in the order of said rearrangement of said quantized transform components within said quantized transform subblock. The probability of the i^(th) quantized transform component following quantized transform component index u,v,w within said quantized transform subblock being 0 is written p_((u,v,w)+i)(x==0). The probability of the r^(th) quantized transform component following quantized transform component index u,v,w within said quantized transform subblock requiring b bits is written p_((u,v,w)+r)(log₂(x)==b).
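One way to combine these per-component probabilities into a symbol probability is sketched below. It assumes that the components are treated as statistically independent, so that the probability of a run of r zeros followed by a b-bit value is the product of the individual probabilities, and that the run begins at the position named `start`. The independence assumption, the indexing convention, and the name `symbol_probability` are not taken from the description.

```python
def symbol_probability(p_zero, p_bits, start, run, bits):
    """Probability of `run` zeros beginning at `start`, then a `bits`-bit value.

    p_zero[k] : modeled probability that component k (transmission order) is 0
    p_bits[k] : dict mapping a bit count to its modeled probability for component k
    """
    probability = 1.0
    for i in range(run):
        probability *= p_zero[start + i]
    return probability * p_bits[start + run].get(bits, 0.0)


# Illustrative per-component probabilities in transmission order.
p_zero = [0.2, 0.5, 0.8]
p_bits = [{1: 0.5, 2: 0.3}, {1: 0.4, 2: 0.1}, {1: 0.15, 2: 0.05}]
p_symbol = symbol_probability(p_zero, p_bits, start=0, run=2, bits=1)
```

Since every input is derived from the calculated variances, compressor and decompressor can each evaluate the same symbol probabilities without exchanging a table.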

Conclusion

While the present invention has been described in its preferred version or embodiment with some degree of particularity, it is understood that this description is intended as an example only, and that numerous changes in the composition or arrangements of apparatus elements and process steps may be made within the scope and spirit of the invention. In particular, rearrangement and recalculation of statistics may be made to support various modes of progressive transmission, including spectral banding or bitwise refinement. Further, pixel statistics may be measured and transmitted on a per-block or global basis, and may be measured in each dimension or averaged across all dimensions. Block sizes may also be taken to be as large as the entire frame, as would be typical when using the wavelet transform.

With regard to the processes, systems, methods, heuristics, etc. described herein, it should be understood that, although the steps of such processes, etc. have been described as occurring according to a certain ordered sequence, such processes could be practiced with the described steps performed in an order other than the order described herein. It further should be understood that certain steps could be performed simultaneously, that other steps could be added, or that certain steps described herein could be omitted. In other words, the descriptions of processes herein are provided for the purpose of illustrating certain embodiments, and should in no way be construed so as to limit the claimed invention.

Accordingly, it is to be understood that the above description is intended to be illustrative and not restrictive. Many embodiments and applications other than the examples provided would be apparent to those of skill in the art upon reading the above description. The scope of the invention should be determined, not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. It is anticipated and intended that future developments will occur in the arts discussed herein, and that the disclosed systems and methods will be incorporated into such future embodiments. In sum, it should be understood that the invention is capable of modification and variation and is limited only by the following claims.

All terms used in the claims are intended to be given their broadest reasonable constructions and their ordinary meanings as understood by those skilled in the art unless an explicit indication to the contrary is made herein. In particular, use of the singular articles such as “a,” “the,” “said,” etc. should be read to recite one or more of the indicated elements unless a claim recites an explicit limitation to the contrary.

1. An apparatus comprised of a compressor and decompressor and a method for generating an optimally compressed representation of multidimensional visual data after transformation by a multidimensional orthogonal transform of a specified transformation block size, after quantization by coefficients of said transformation block size, and after rearrangement of said quantized coefficients into a transmission sequence, and after collection of said quantized transformation coefficients into symbols, by the application of said quantized decorrelating transform to a plurality of measured variances of uncompressed multidimensional visual data and measured correlation coefficients of uncompressed multidimensional visual data to calculate the probability distribution of each quantized transform coefficient required to perform entropy removal,
 2. The method of claim 1 where said orthogonal transform is the discrete cosine transform,
 3. The method of claim 1 where said multidimensional visual data comprises a two-dimensional still image,
 4. The method of claim 3 where said transformation block size comprises the entire image,
 5. The method of claim 3 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per block and said plurality of correlation coefficients is one averaged value per frame,
 6. The method of claim 3 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per block and said plurality of correlation coefficients is one averaged value per block,
 7. The method of claim 3 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per dimension per frame and said plurality of correlation coefficients is one averaged value per dimension per frame,
 8. The method of claim 3 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per block and said plurality of correlation coefficients is one averaged value per dimension per block,
 9. The method of claim 1 where said multidimensional visual data comprises a three-dimensional moving video sequence,
 10. The method of claim 9 where said transformation block size comprises a number of frames by the entire size of a single frame,
 11. The method of claim 9 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per group of frames and said plurality of correlation coefficients is one averaged value per group of frames,
 12. The method of claim 9 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per block and said plurality of correlation coefficients is one averaged value per block,
 13. The method of claim 9 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per dimension per group of frames and said plurality of correlation coefficients is one averaged value per dimension per group of frames,
 14. The method of claim 9 where said plurality of measured variances of uncompressed multidimensional visual data is one averaged value per dimension per block and said plurality of correlation coefficients is one averaged value per dimension per block,
 15. The method of claim 1 where said quantizers are all ones,
 16. The method of claim 1 where said quantizers are all equal,
 17. The method of claim 1 where said quantizers are visually weighed,
 18. The method of claim 1 where coefficients are organized within each block into order of decreasing calculated component variance,
 19. The method of claim 18 where the probability of symbols is calculated from a definition of a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the actual non-zero value, a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the number of bits required to represent the non-zero value, an end-of-block symbol whose conditional expectation is calculated from the cumulative probability of a sequence of symbols comprised solely of zeroes, and an escape symbol whose conditional expectation is calculated from the accumulation of the probability of all symbols not otherwise defined.
 20. The method of claim 1 where coefficients are organized across blocks into order of decreasing calculated component variance,
 21. The method of claim 20 where the probability of symbols is calculated from a definition of a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the actual non-zero value, a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the number of bits required to represent the non-zero value, an end-of-block symbol whose conditional expectation is calculated from the cumulative probability of a sequence of symbols comprised solely of zeroes, and an escape symbol whose conditional expectation is calculated from the accumulation of the probability of all symbols not otherwise defined.
 22. The method of claim 1 where coefficients are organized across blocks into bands of decreasing calculated component variance within order of successive refinement,
 23. The method of claim 22 where the probability of symbols is calculated from a definition of a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the actual non-zero value, a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the number of bits required to represent the non-zero value, an end-of-block symbol whose conditional expectation is calculated from the cumulative probability of a sequence of symbols comprised solely of zeroes, and an escape symbol whose conditional expectation is calculated from the accumulation of the probability of all symbols not otherwise defined.
 24. The method of claim 1 where coefficients are organized across blocks into bands of equal weight in order of decreasing calculated component variance,
 25. The method of claim 24 where the probability of symbols is calculated from a definition of a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the actual non-zero value, a plurality of symbols as collected from sequences of component values whose conditional expectation is zero followed by the number of bits required to represent the non-zero value, an end-of-block symbol whose conditional expectation is calculated from the cumulative probability of a sequence of symbols comprised solely of zeroes, and an escape symbol whose conditional expectation is calculated from the accumulation of the probability of all symbols not otherwise defined.
 26. The method of claim 1 where Huffman coding is used to perform entropy removal on the constructed stream of symbols,
 27. The method of claim 26 where said measured variances of uncompressed multidimensional visual data and said measured correlations of uncompressed multidimensional visual data are communicated between compressor and decompressor,
 28. The method of claim 1 where arithmetic coding is used to perform entropy removal on the constructed stream of symbols,
 29. The method of claim 28 where said measured variances of uncompressed multidimensional visual data and said measured correlations of uncompressed multidimensional visual data are communicated between compressor and decompressor,
 30. The method of claim 1 where said decorrelating transform is any orthonormal wavelet. 